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Abstract. Bayesian modeling and analysis of the MEG and EEG modalities 
provide a flexible framework for introducing prior information complementary 
to the measured data. This prior information is often qualitative in nature, 
making the translation of the available information into a computational model 
a challenging task. We propose a generalized gamma family of hyperpriors 
which allows the impressed currents to be focal and we advocate a fast and ef- 
ficient iterative algorithm, the Iterative Alternating Sequential (IAS) algorithm 
for computing maximum a posteriori (MAP) estimates. Furthermore, we show 
that for particular choices of the scalar parameters specifying the hyperprior, 
the algorithm effectively approximates popular regularization strategies such 
as the Minimum Current Estimate and the Minimum Support Estimate. The 
connection between priorconditioning and adaptive regularization methods is 
also pointed out. The posterior densities are explored by means of a Markov 
Chain Monte Carlo (MCMC) strategy suitable for this family of hypermodels. 
The computed experiments suggest that the known preference of regulariza- 
tion methods for superficial sources over deep sources is a property of the MAP 
estimators only, and that estimation of the posterior mean in the hierarchical 
model is better adapted for localizing deep sources. 



1. Introduction 

The human brain contains approximately 10-'^"'^ exit able neurons whose resting 
state is characterized by a cross- membrane voltage difference. Electromagnetic sig- 
nals propagate as perturbations of this voltage difference, or action potentials, along 
the axons and are transferred across the synaptic gaps by neurotransmitters, creat- 
ing a post-synaptic potential in the receiving neurons. The post-synaptic potential 
may be relatively stable over a period of milliseconds and, being well localized, 
it can be modelled mathematically as current dipole. The neurons are organized 
in bundles, and when thousands of neighboring neurons are simultaneously in the 
post-synaptic excitation state, the net effect of the post-synaptic potentials gives 
rise to a localized current approximately parallel to the neuron bundle. This el- 
ementary impressed source current drives an Ohmic volume current in the brain 
tissue, and the net electromagnetic field can be registered on or outside the skull. 
Imaging of the neuronal activity based on the registered electric voltage (EEG) 
or on the magnetic field (MEG) has become a standard research tool in clinical 
and cognitive studies. Furthermore, when coupled to functional imaging methods. 
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MEG and EEG have a great potential to gather pertinent information about the 
couphng between neuronal activity and cerebral hemodynamics. 

The advantage of electromagnetic brain imaging modalities is their good tempo- 
ral resolution, about a millisecond, while their spatial resolution is limited by various 
factors, including weak signal-to-noise ratio, ambiguities in the source estimation 
due to the non-uniqueness of the inverse source problem [25 and uncertainties in 
the model. For example, while EEG suffers from lack of knowledge of the electric 
conductivity distribution and anisotropy of the head, MEG is known to be less 
affected by that [S', "3T. For both modalities, the anisotropy of the white matter 
may result in a strong bias of the volume currents, an effect that should be taken 
into account in a detailed model. 

The properties of the solutions to electromagnetic inverse source problems de- 
pend on the a priori information implemented in the algorithm. One piece of 
information that may certainly improve the spatial resolution, if properly incorpo- 
rated in the algorithm, is the focal nature of the impressed currents. In fact, since 
the electric and magnetic fields outside the skull depend linearly on the impressed 
currents, but the component of the source belonging to the significantly nontrivial 
null space of the forward map has no effect on the data, the determination of a 
physiologically meaningful solution must be based on complementary information. 
The implementation of feasible selection criteria has led to different solutions. For 
example, setting the null space component to zero gives the minimum norm esti- 
mate (MNE) [12 , while minimization of the current, or £^ norm of the source results 
in the minimum current estimate (MCE) [31]. Solutions such as the Low Resolu- 
tion Electromagnetic Tomography (LORETA) [22 are based on the assumption of 
smoothness of the source. In [13 , the localization is pursued by local multipole 
expansions of the fields. Low-dimensional parametric models, for example those 
using only a few current dipoles [8l [191 E] ^ lead to localized solutions but limit the 
number of possible source configurations. Prior anatomical information such as the 
location of the sulci has been shown to improve the performance of the localization 
I2j. 

Bayesian methods are widely used in EEG/MEG to implement pertinent prior 
information such as anatomic constraints or functional information based on other 
imaging modalities [U [TH [27l [30]. An additional level of flexibility to Bayesian 
modelling is provided by hierarchical models that allow uncertainties in the prior 
model itself [TJ [211 US]- In particular, the model hierarchy is a powerful tool for 
including prior information that is qualitative rather than quantitative [7 . 

The connection between classical regularization methods and Bayesian Maxi- 
mum A Posteriori (MAP) estimates is well known. In particular, the Gaussian 
models lead to a MAP estimate that coincides with the standard Tikhonov regu- 
larized solution with a quadratic penalty, while non- Gaussian models are needed 
for non-quadratic penalties that are often more useful but computationally more 
challenging. In this work, we construct a conditionally Gaussian hierarchical para- 
metric model that has the computational advantages of Gaussian prior models but 
leads to a rich class of MAP estimators that have the desirable qualitative prop- 
erties of numerous commonly used non-Gaussian models. Using the conditional 
normality, we construct a fast, efficient and simple MAP estimation algorithm, and 
show that with proper choices of the few model parameters, the algorithm can be 
interpreted as a fixed point iteration for solving the minimum norm estimate, the 
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minimum current estimate and, more generally, the minimum £^ estimate, while 
with a different choice, the algorithm approximates the minimum support estimate 
(MSE) [20 . Hence, our approach puts these methods in a unified framework. 

The estimation algorithms require the solution of large linear systems, the sys- 
tem size depending on the discretization of the model. When a realistic three- 
dimensional model is used as in the present paper, iterative linear systems solvers 
are indispensable. It is common practice to improve the performance of the itera- 
tive solvers by preconditioners [24]. Recently, the authors introduced the concept 
of priorconditioner [6 , a preconditioner that is based on the prior model rather 
than on linear algebraic properties of the system matrix. It is shown in this ar- 
ticle that the priorconditioning based on the conditionally Gaussian hierarchical 
models yields an effective implementation of adaptive regularization similar to the 
FOCUSS (FOCal Underdetermined System Solver) algorithm [10 . 

A well-known feature of many regularization based source localization algo- 
rithms, including the minimum norm and minimum current estimates, is their ten- 
dency to favor surface sources, leading sometimes to a gross misplacement of deep 
sources. To suppress this bias, different weighting methods have been proposed to 
favor deep sources over the shallow ones, see, e.g., [16 for a recent account. From 
the statistical point of view, however, a depth dependent prior covariance which 
compensates for the preference of the likelihood towards superficial sources appears 
rather dubious, since there is no reason to believe a priori that a deep source should 
have higher variance than a superficial one. In this article, we confirm with numeri- 
cal experiments that surface biasing may be a property of the MAP estimate, while 
Markov Chain Monte Carlo (MCMC) sampling based posterior mean estimates may 
localize better deep sources without requiring weight compensation. This result re- 
inforces the concept, well known to the Bayesian statistics community, that the 
mode of the posterior distribution may do a poor job at representing the distribu- 
tion, demonstrating that the Bayesian formulation of the inverse problem is much 
more than "yet another way to regularize an ill-posed problem", as is sometimes 
incorrectly stated. 

Preliminary testing of the inversion algorithms and the MCMC sampling is done 
using a simple geometry with constant conductivity, modelling the distributed cur- 
rents by a field of current dipoles. Subsequently, numerical experiments with a 
realistic three dimensional head model, with different electric conductivities in the 
scalp, skull, cerebrospinal fiuid and the brain tissue are also presented. The volume 
current calculations are carried out with a finite element algorithm developed for 
this purpose. The distributed currents are represented by Raviart-Thomas elements 
that are a reasonable substitute for singular dipoles in the FEM context. 

2. Forward model, EEG and MEG 

We start by introducing the notations to be used in the sequel and review some 
basic facts concerning the forward model, which is based on the standard approach 
using the quasi-static approximation of Maxwell's equations We denote by 

D C the head with boundary surface S and scalar conductivity distribution 
a > 0. If we let J denote the impressed current density in the total current 
density, consisting of the impressed current and the Ohmic volume current, is Jtot = 
J + crE, where E is the electric field induced by the current. Under the quasi-static 
approximation, we may assume that E is conservative, E = —Vu. Neglecting the 
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electric displacement current in the Maxwell-Ampere equation, the total current 
density is divergence free, leading to the Poisson equation for the electric potential, 

du 



(1) V-(aVii)=V-J, 



0, 



where the Neumann boundary condition follows from the assumption that the con- 
ductivity vanishes outside D. The magnetic field B outside the head induced by 
the total current is, according to the Biot-Savart law, obtained as 

(2) B(x) = g|^J,otX^^dy, xeR^D. 

The computation of the magnetic field therefore consists of two steps: the numerical 
solution of the Poisson equation ([T]) to find u and thus the total electric current 
density Jtot = J — cfVu^ and the computation of the integral yielding the magnetic 
field B ([2|. Each of these steps poses computational challenges. 

The solution of the boundary value problem ([T]) can be formally expressed in 
terms of the Neumann Green's function of the diffusion operator V • aV, 



(3) u{x) = / g^{x,y)V ■J{y)dy. 



-b 



We assume that the electric potential is measured at locations x^, 1 < ^ < L on 
the surface S of the head. The approximation of the impressed current by a finite 
linear combination J = Y^^=i^k]k of basis currents j/c, 1 < < i^, leads to a 
discrete model 

ui = u{xi) = y2\ gN{xi,y)\/ - jk{y)dy] ak 
k=i ^ 

K 

= ^M,>,, 1<^<L, 

k=i 

where the matrix G R^^^ is the electric lead field matrix. Similarly, assume 
that outside the head at points x^, 1 < n < A/", the projection of the magnetic 
field in given directions is measured. Substituting the expression ([3| in the 
Biot-Savart law and representing the impressed current in the basis j^, we have the 
discrete model 



K 



Xn-y , 
X- -dy 

Fn - yr 



K 

k=l 



where the matrix G R^^^ is the magnetic lead field matrix. 

The EEG inverse source problem is to estimate the vector a G R^ from the 
noisy observations of the voltage potential u G R^, while in the MEG inverse source 
problem the data consist of the noisy observations of the magnetic field component 



vector V G R^. 



CONDITIONALLY GAUSSIAN HYPERMODELS FOR CEREBRAL SOURCES 



5 



In our model, we assume that the current elements }k that constitute the basis 
for the distributed current are either dipoles or dipole-like vector- valued elements. 



3. Imaging in the Bayesian framework 

In the Bayesian framework, inverse problems are recast in the form of statistical 
inference [6l [15] . The lack of information about any of the quantities appearing in 
the formulation of the problem is expressed by modelling them as random variables, 
and the available information is encoded in the probability densities. 

In the electromagnetic inverse source problem, the goal is to estimate the coef- 
ficient vector a from the observations 

b Ma + e, 

where M is either the electric or magnetic lead field matrix and e is noise that, 
for simplicity, is modelled here as additive. Although not necessary, we assume 
that the noise is white Gaussian with known variance a^, which we assume known, 
leading to a likelihood model of the form 

7T{b I a) (X exp (^-^11^ - Maf 
We consider prior models that are conditionally Gaussian and of the form 



7rprior(<^ \ 0) (xexp \ -^\\D^ 



a|p-i^log^,j 



where Dq is a diagonal matrix, Dq = diag(6>i, . . . ^Ok)^ and the logarithmic term 

— 1/2 

comes from normalizing of the prior density by the determinant of Dg ' . The 
posterior density conditional on d is, by Bayes' formula. 



7r(a \ h^S) (X 7rprior(<^ | 0)7r{b \ a) 

exp I -^||6-Ma||2 ^ 



Assuming the variance vector known and fixed, the MAP estimate for a, 

c^MAP = argmin (^^11^ - Maf + ^WD^^^^af 

is the classical Tikhonov regularized solution with a penalty defined by the diagonal 
matrix D. It is known that if has equal entries, this solution is smeared out 
even if the data corresponds to a focal input. To improve the localization, non- 
quadratic penalties, for example the -^^-norm, p < 2, of the coefficient vector a, 
have been proposed. Here, we take a different approach assuming instead that the 
variance vector is unknown, and thus making its estimation a part of the inverse 
problem. The variance vector is modelled as a random variable, and available a 
priori information concerning it is expressed by a hyperprior 7rhyper(^)- The prior 
probability density of the pair (a, 0) is then 



^prior 
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and, according to Bayes' formula, the posterior probability density, conditioned on 
the observation b alone, becomes 

7r(a,6> I b) ex TThyper W7rprior(<^ | 0)7r{b I a). 

This implies that in the present formulation we need to estimate both a and its 
prior variance vector 0. 

4. Hypermodels: MCE, minimum £p and beyond 

Prior densities require quantitative information about the unknown, e.g., an 
estimate for its mean and its dynamical range, which is in turn related to the prior 
variance. The flexibility of hypermodels lie in their ability to import qualitative 
information into the estimation process, see O [71 SI [6] . In this article, we assume 
that the only a priori information concerning the impressed current is that it should 
consist of few focal sources. In statistical terms, such information can be expressed 
by the following three statements: 

(1) Nearby source current elements, should not be, a priori, mutually depen- 
dent, to favor the focality; 

(2) No location preference for the activity should be given a priori; 

(3) Most of the dipole-like sources should be silent, while few of them could 
have large amplitude. 

To encode these conditions into the hyperprior, we observe first that stochastic 
dependence among the variances Ok would couple the dynamical ranges of the cor- 
responding coefficients a/e. Therefore, it is reasonable to assume that the variances 
Ok are mutually independent. On the other hand, without prior knowledge about 
the location of the active sources, it is also reasonable to assume a priori that the 
variances are equally distributed. Furthermore, we want to allow the distribution 
of the variances to favor small values while permitting rare large outliers which cor- 
respond to large amplitude of the source. Among the wealth of distributions that 
meet these requirements, we choose the parametric family of distributions, known 
as the generalized gamma distribution^ Ok ^ GenGamma(r, ^o), defined as 

TThyperiO) = TThyper T, 6>o) 

(4) cc n^f-iexp(-p)=expf-f^| + (r/3-l)f;iog^,y 

fc=i ^ ^oJ \ k=1^0 k=l I 

In particular, we remark that by choosing r = 1, we have the gamma distribution, 
9k ^ Gamma(/3, ^o) = GenGamma(l,/3,^o)> 

7rhype.(^) cc n ^f-^exp (-^) = exp (- ^ + (/3 - 1) f^log^fc) , 

k=i V 0/ Y J 

while with r = —1, we obtain Ok ^ InvGamma(/3, ^o) = GenGamma(— 1, ^o) 
which is the inverse gamma distribution^ and 

^hyper(^) cc ([ O-^-'exp (-^^] = exp | - (/? + 1) ^ogO^ . 

k=l \ k=l k=l / 

For gamma and inverse gamma distributions, the parameters P and are re- 
ferred to as shape parameter and scaling parameter^ respectively. A discussion of 
the similarities and differences of these two distributions, in particular with regard 
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to the frequency of occurrence and value of outliers, can be found in [5 , where 
we also propose the following Iterative Alternating Sequential (IAS) algorithm for 
computing the MAP estimate, (o^map^^map) = argmax{7r(a, ^ | 6)}: 
IAS MAP estimation algorithm: 

(1) Initialize = 0^ and set i = 1; 

(2) Update a by defining a* = argmax{7r(a | b^0'^~^)}; 

(3) Update by defining = argmax{7r(^ | 6, a*)}; 

(4) Increase i by one and repeat from 2. until convergence. 

In the algorithm, following Bayes' formula, the conditional posterior probabilities 
are obtained as 7r(a | h^0^~^) ex 7:{a^9^~^ \ b) and 7r{0 \ 6, a*) ex 7r(a*,^ | 6), i.e., 
alternatingly the vector a or the vector in the expression for the posterior density 
set to the current estimate. 

The IAS algorithm has been previously applied to image and signal deblurring 
[SI mil], and it is found to give a fast and stable algorithm that is easy to implement. 

Before discussing how to organize the computations for different choices of the 
hyperprior parameter r in Q, pointing out connections with known algorithms as 
appropriate, we want to point out a difference between our approach and previously 
proposed ones. 

In the literature of Bayesian hierarchical models, the gamma distribution is often 
suggested as a hyperprior for the precision, or inverse of the variance, because of 
its conjugacy property. This corresponds to the inverse gamma distribution for the 
variance. The conjugacy property is useful, e.g., when variational Bayes methods, 
or variants of the closely related EM algorithm, are used and analytic marginal 
integrals are desired (Rao-Blackwellization), see, e.g., ^7\. Relevant references for 
the MEG problem are [26l |33] . Interestingly, in [26 the gamma hyperprior for the 
precision was suggested but the connection with the regularization methods was 
not pointed out. We emphasize that our approach neither needs the conjugacy, 
nor does it take advantage of it. In fact, as we will show below, the IAS algorithm 
yields fast, efficient and explicit estimators with a large range of parameter values 
corresponding to non-conjugate models. The generalized gamma distribution is 
chosen here solely on the basis that it allows rare outliers. 

Gamma distribution and Minimum Current Estimate. Consider the poste- 
rior density of the pair (a, 0) when the hyperprior is the gamma distribution 



To solve the first maximization problem in the IAS MAP estimation algorithm, 
set 6 = 6>*~^ and let the updated be the minimizer of the negative of the log- 
posterior. 
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which is the least squares solution of the linear system 



(6) 



^-1/2 



a 







Subsequently, setting a = a\ the updated value of can be found by differentiating 
the log-posterior with respect to 6 and setting the derivative equal to zero. The 
resulting equation is a second order equation 

ia|/^2 _i/0^+ ri/0j =0, 71 = 13- 3/2, aj = a) , 

for the positive root, whose analytic expression is 

0'j = \9o{v+^V^+2ay9o). 



In particular, if we let 77 = 0, we have O'j = ly^o/^, and, by substituting this in 
([5]), we obtain 



lin I - 



argmin ( :^\\b - Ma|p 



1 ^ al \ 



y2cr2" " 

which is also a fixed point iterate of the minimization problem whose solution is 
the Minimum Current Estimate (MCE) [31] 



QfMC = argmin \\b - Maf + ^ V , ^ = V Z"^ 



2 



Observe that, by choosing jS > 3/2 in the hyperprior, we avoid the problem of 
dividing by components a]r^ near or equal to zero. This hence provides a natural 
regularization method for solving the Minimum Current Estimate problem. 

Generalized gamma distribution and estimates. The choice of the hy- 
perprior from the family of generalized gamma distributions leads to the posterior 
model 

7r(a,6> I 6) (X exp^ - ^||^ - Ma|p 

- I \\D;'^'ar -^toi+U-Df: log e,] . 

k=l ^ ^ k=l / 

The updating of a given the current value 6>*~^ requires solving the system ^ 
as in the case of gamma hyperprior, while the updating formula for the variance 
parameter changes. Setting the derivative of the logarithm of the posterior density 
equal to zero leads to the algebraic equation 

^a^e] - rey^lBl + (r/3 - 3/2)/^, = 0, a, = a). 



This equation does not have, in general, a closed form solution, although when 
rP = 3/2 the solution is simply 
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As in the case of the gamma distribution, after substituting this solution into 
the objective function in ^ we notice that the updated is a fixed point iterate 
of the ^^-penahzed regularization problem, 




ap = argmin ||6 - Ma|p + J V , 5 = 2a^ 



2^x1/(^+1) 



with r = p/{2 — p). It is known that when < p < 1, i.e., < r < 1, this solution, 
like the MCE solution, tends to minimize the support of the estimated current, thus 
yielding a good localization of the focal activity. On the other hand, letting r ^ oo, 
or, equivalently, p ^ 2, the MAP estimate approaches the Tikhonov regularized 
solution with a quadratic penalty. The intermediate case 1 < p < 2 is related to 
the analysis presented in fT^. 

Inverse gamma distribution and Minimum Support Estimate. The poste- 
rior density for the inverse gamma hypermodel is of the form 

7r(a, 6> I 6) cx exp 1 - ^||6-Ma|p 



k=l ^ \ / k=l 



Once again, the updated value a* is found by solving ^ keeping fixed to the 
current value while Oj is the zero of the derivative of the logarithm of the 

posterior which satisfies 

with K = P -\- 3/2, and aj = a* , and can be expressed as 

'1 



By interpreting this algorithm as a fixed point step of a regularization scheme 
with a nonlinear penalty term, we can reformulate it as the following minimization 
problem 

a = argmin (^\b - Maf + ,5^ {a!y+29o ) ' = 4'^'^'' 

whose solution is the Minimum Support Estimate (MSE) [20]. In [6l [5j [7], the 
authors have shown that, in the context of traditional image processing, the corre- 
sponding penalty is related to the Perona-Malik functional [23 . 

Higher order inverse gamma distributions. The gamma and inverse gamma 
distributions are standard models in hierarchical Bayesian methods and the MAP 
estimates correspond to known regularizing schemes. The combination of the gen- 
eralized gamma distribution and the IAS algorithm provide models that are com- 
putationally tractable and lead to new estimators. As an example, if we choose 
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r = —q^ where > 1 is an integer, the posterior distribution is 
7r(a,6> I 6) (X exp^ - ^11^ - Ma\\^ 

k=l ^ \ / k=l 

The algebraic equation for updating 0^ is 

i.e., Ok is a positive root of a polynomial of order and can be computed in a stable 
way by considering the companion matrix. 



5. Hyperpriorconditioners 

Direct computation of the least squares solution of the linear system ([6|, which 
yields the updated a, becomes prohibitively slow when the dimensionality of the 
problem is large, as is the case in the applications that we are considering here. 
Iterative methods are the method of choice for the solution of large scale linear sys- 
tems. In the case of linear discrete ill-posed problems, this approach is particularly 
attractive because of the inherent regularizing properties of iterative solvers when 
equipped with suitable stopping criteria. Iterative linear system solvers start from 
a given approximate solution and proceed to determine a sequence of improved 
approximate solutions. 

Assume, for the moment, that in the process of updating a we introduce the 

— 1/2 

change of variable w = D^^ ' ol where is the current value for 6>, and express the 
solution of the optimization step in the form 

(7) «;+ = argmin - MD.^f «;f + \\\wf^ , = D^f 

We remark that while in the statistical framework a suitable choice of the matrix 
Dq^ makes the change of variable equivalent to whitening the random variable a, 
in the context of iterative linear systems solvers, this transformation amounts to 

1 /2 

preconditioning. Since in our problem the choice of the matrix Dq is dictated 
by the selection of the prior, following [6] we refer to it as priorconditioner, to 
emphasize the connection between the numerical performance and the statistical 
setting. 

After the change of variables, the solution of the least squares problem ^ for 
updating a coincides with the standard Tikhonov regularized solution for solving 
the preconditioned linear system 

(8) G-^MDy^w = (j-H, a = D^^w. 

It has been shown in the literature that iterative Krylov subspace methods such as 
CGLS (see [24 ) with early stopping of the iterations may give results of comparable 
quality to Tikhonov regularization but are computationally much more efficient. 

The introduction of a suitable right priorconditioner, which in the Bayesian 
framework is related to the prior of the unknown of primary interest, to bias the 
iterates towards a desirable subspace, has been shown to improve the quality of the 
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computed solution, in particular when the number of iterations is limited by either 
high computational costs or a large noise level in the data [6]. 

In general, statistically inspired preconditioners, which convey into the linear 
system solver prior beliefs about the solution, are constructed from the covariance 
matrix of the prior. In the application considered here, however, the prior is a 
function of unknown parameters, whose distribution is, in turn, given by the hy- 
perprior. Since the parameters of the prior are themselves part of the estimation 
problem, they are updated at each iteration step. To ensure that the solution of the 
minimization problem (pf uses the most up to date information about the problem, 

1/2 

as soon as Oc becomes available, we update the priorconditioning matrix D^^^ , then 
proceed to compute a new estimate of a. 

We remark that the idea of updating the preconditioner to take advantage of 
newly acquired information about the linear systems was already proposed in the 
context of a Flexible Generalized MINimal RESidual (FGMRES) scheme [24], al- 
though the motivation for the updating were quite different. In the Bayesian par- 
adigm we can view priorconditioning in the context of hypermodels as priorcondi- 
tioning conditioned on the present estimate of the prior parameters. 

Finally, it is of interest to notice that the alternating updating scheme where the 
least squares problem is solved with preconditioned iterative methods ([8| includes as 
a special case an algorithm previously proposed in the literature, based on adaptive 
weighting of the ^^-norm. In fact, when the hyperprior is the gamma distribution 
with = 3/2, the IAS is essentially the FOCUSS algorithm discussed, e.g. in [TOl. 
While in the FOCUSS algorithm the regularization is obtained by passing from ([8| 
to the normal equation corresponding to ([7|, here we advocate regularization by 
truncated iteration. The connections between the minimum current estimate and 
FOCUSS from the empirical Bayesian point of view have been pointed out also in 

6. MCMC AND REGIONS OF INTEREST 

A great advantage of the Bayesian approach over different regularization schemes 
is that starting from the posterior density, we can compute a number of different es- 
timates and furthermore quantify their reliability. The uncertainty quantification, 
however, usually requires sequential sampling techniques which are computation- 
ally considerably more intensive than optimization based computation of single 
estimates, in particular when applied to a detailed three dimensional model. 

Various dimension reduction methods have been proposed in the literature to 
make the MCMC sampling viable. A common approach is to restrict the source 
sampling either to the surface of the brain or to a thin cortical layer, see, e.g., [26 . 
In MEG, a further reduction of the dimensionality of model may be achieved by 
restricting the sampling to the cortical regions with a non-radial normal vector, 
see, e.g., [1 . These model reductions are not applicable for us, since we consider 
both EEG and MEG and the possibility of recovering deep sources with the MCMC 
sampling. 

Fortunately, in applications where we are interested in local sources, it is often 
sufficient to restrict the sampling to a much smaller Region Of Interest (ROI), 
around the potentially active area. The selection of the ROI can be based on prior 
information about the expected activity: for example, the primary response to a 
visual stimulus is expected to occur in the occipital lobe. Alternatively, the ROI 
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can be selected around an estimated focus of activity. Note that the concept of 
ROI does not exclude the possibilities of restricting the sampling to a portion of 
the cortical layer, as is done in the cited articles, or of sampling over the whole 
brain, if the computing resources are not an issue. 

Once the ROI has been identified, we collect the indices of the source basis vectors 
j/c whose support is in the ROI in the index vector /roi, and the remaining ones in 
the vector Iq. We then partition the vectors a and accordingly, introducing the 
notation a^oi = ^(/roi), c^o = Q^(^o), ^roi = ^(^roi) and Oq = 0{Io). 

We can now perform MCMC sampling over the ROI, fixing the outside cur- 
rent values and prior variance to prescribed values ao, ^o, using the conditional 
distribution, 

7r(aRoi,^ROi I ao,Oo,b) ex 7r([aRoi, Q^o], [^roi, ^o] | b). 

Evidently, ao = is the most natural choice, corresponding to the assumption that 
no activity outside the ROI appears. The MCMC algorithm that we propose is an 
independence sampling method, where aRoi and ^roi are updated sequentially by 
a procedure analogous to that of the IAS estimation algorithm. The updating of 
aRoi is done using the conditional normality of the posterior, while the updating 
of ^Roi is done via a full scan Gibbs sampler [6l [151 HZI • 
MCMC sampling over ROI: 

(1) Initialize Q^rqd ^roi set z = 1. Define M, the desired sample size. 

(2) Draw o^rqi fro^i the Gaussian density 

7r(aRoi I 6>^o\'^o,^o,^) cx 7r([aRoi,ao], [^Ro\'^o] | b). 

(3) Draw 6>roi componentwise with a Gibbs sampler from the density 

7r(6>Roi I aRoi,ao,^o,^) cx 7r([aRoi, o^o], [^roi, ^o] | b). 

(4) If i = M, stop; otherwise increase i by one and repeat from 2. 

In the practical implementation of the algorithm we update OfRoi by defining a 
matrix G and its partitioning. 



G 



[ Groi Go 



where Groi contains the columns of G with indices in /roi and Go the remaining 
columns. The updated o^rqi obtained by solving, in the least squares sense, the 
linear system 



GroiQ^roi 







Goao + w A/'(0,/). 



The updating of ^/c, /c G /roi, is performed by drawing from the one-dimensional 
probability density 

7r,{0,) oc exp f-^ - f + (rP - ^ log0, 



20k J \ 2 

by the inverse cumulative distribution method [6l[T5]. Hence, the sampling tech- 
nique just outlined takes advantage of the conditional normality of the prior and 
of the mutual independence of the variances, similarly to the IAS algorithm. 
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Figure 1. MAP estimates of the current (top row) and of the 
variance of the prior (bottom row) at different depth in the ROI. 
The hyperprior is the gamma distribution. The true current dipole 
used for generating simulated data is shown as a hahow arrow, and 
the ROI is marked on the superficial layer. 
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Figure 2. MAP estimates of the current (top row) and of the 
variance of the prior (bottom row) at different depth in the ROI. 
The hyperprior is the inverse gamma distribution. The true current 
dipole used for generating simulated data is shown as a hallow 
arrow, and the ROI is marked on the superficial layer. 



7. Computed experiments 

In this section we apply the methodology derived above to inverse source prob- 
lems by first considering an example with a simplified planar geometry using a 
traditional singular dipole model, then applying it to a realistic conductivity model 
for the human head. In the latter case, both the MEG and EEG modalities are 
considered. Since the finite element method (FEM) is needed to solve the poten- 
tial distribution, we use FEM basis functions also for representation of the current 
density. 

MEG in planar geometry. Consider a half space as a local model for the hu- 
man head. The half space model is particularly appropriate to illustrate the depth 
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Figure 3. Posterior mean estimates of the current (top row), of 
the variance of the prior (center row), and of the variance of the 
norm of the current estimated over the sample (bottom row). The 
hyperprior is the gamma distribution. 
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Figure 4. Posterior mean estimates of the current (top row), 
of the variance of the prior (center row) and the variance of the 
norm of the current estimated over the sample (bottom row). The 
hyperprior is the inverse gamma distribution. 



resolution with different hypermodel parameters and with different statistical esti- 
mators. 

The magnetic field component perpendicular to the surface is recorded in a 
rectangular array of observation points X£ above the surface z = 0. We represent 
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4 4 
X 10 X 10 



Figure 5. Sample histories of the components Oj and ||qj|| for 
the gamma hyperprior (top row) and the inverse gamma (bottom 
row), where j is the index of the component of maximum value of 

OcM- 



the current density as a linear combination of point-wise current dipoles, 

K K 

J(^) = '^Ky- yk){oLl^i + oil^2) = ^5{y- yk)cik^ 



where ei and 62 are orthogonal basis vectors parallel to the plane z = 0. In this 
example we ignore the volume currents, which is tantamount to setting <j = 0. 
More generally, assuming a conductivity density that depends only on the depth 
[25) . we arrive at a particularly simple magnetic lead field model. 



Mo 
47r 



K 2 

EE 

/c=l j = l 



63 • (e^ X {xj - yk)) 



ai, or b=[ ] 



with a = G M?^ . When writing the conditional prior, we identify the 

variances and of the two mutually perpendicular dipoles whose locations 
coincide, and the unknown variance ^1 ~ ~ is a vector of length K. The 
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posterior density tiien becomes 

7r(a,6> I 6) oc exp^ - - Ma\\^ 

with llq/clP = (o^fc)^ + We use the IAS algorithm to calculate the MAP 

estimate, and MCMC sampling over the ROI to estimate the CM and obtain a 
measure of its uncertainty. 

The geometry of our model consists of a rectangular array of 10 x 10 vertical 
magnetometers 2 cm above the half space, with a distance between adjacent mag- 
netometers of 1 cm. The dipoles are located below the magnetometers in nine 
horizontal layers, each containing a 10 x 10 rectangular array of dipoles. The depth 
of these layers varies from zero (superficial sources) to 4 cm, with a distance between 
the layers of 0.5 cm. 

Since the MAP estimation algorithm is tantamount to fixed point iteration with 
localizing penalties, we expect good performance at detecting focal sources of known 
depth. It is well known [11, 16, 31 that when the depth of the source is unknown, 
the minimum current and minimum norm estimates due to their tendency to bias to- 
wards superficial sources may lead to gross misplacements of the deep focal sources. 
We expect the same behavior from the MAP estimates of our hypermodel. We test 
this by generating synthetic data in which a single dipole source is placed 3.5 cm 
below the surface of the half space. The standard deviation in the likelihood model 
was assumed to be 5% of the maximum noiseless signal. In this simulation, we did 
not add artificial noise to the data, since we are only interested in the model bias, 
not in the noise sensitivity. 

The MAP estimates for the dipole fields as well as for the prior variance with 
model parameter values r = 1 (gamma) and r = — 1 (inverse gamma) are shown in 
Figures [l]-|| When r = 1, we use the values 6>o = 10 ^ and = 3 for the scaling 
parameter and the shape parameter, while when r = — 1, we set 6>o = 10~^ and 

= 3. In both cases we perform 15 iterations with the IAS algorithm. 

As expected, both hypermodels favor superficial sources, with a relatively good 
localization in the horizontal direction, i.e., the MAP estimate of the activity is 
above the true source. The major differences between the two hypermodels are in 
the convergence rate and in the focality of the MAP estimate. With the inverse 
gamma hypermodel the iterative algorithm converges faster than with the gamma 
hypermodel, in particular for non-superficial sources, and seeks to explain the data 
with fewer active superficial dipoles. 

To reduce the dimensionality of the sampling space, we select the ROI to be 
a cylinder with a 6 x 6 cm^ square base around the estimated superficial focal 
activity, shown in Figure [l] - |2j containing 9 x 6 x 6 = 324 dipoles. For each 
hyperparameter model we generate a sample of size M = 50 000, conditional on the 
currents vanishing outside the ROI, and calculate estimates of the posterior means 
of the vectors a and ^, o^^m ~ Jd X^i^i j — 1^ 2, and ^cm = ^fLi 
of the posterior variance of the dipole amplitudes, Var(||q/c||) = ^fLi {(<^^'* — 
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Table 1. The domains of the head model and respective conductivities. 



Layer 


Conductivity 


scalp 


0.33 


skull 


0.0042 


cerebrospinal fluid 


1 


brain 


0.33 



Figure [3] displays the plots of the posterior mean of the current and the estimates 
of the prior variance, and the posterior variance of the amplitude with the gamma 
hyperprior, while Figure [4] shows the analogous results for the inverse gamma hy- 
perprior. The results with the two hyperpriors are qualitatively very different. The 
posterior mean of the current for the gamma hyperprior model is biased towards 
the surface and is very similar to the MAP estimate, while the inverse gamma 
hypermodel has a good depth resolution, with an error between the true and the 
estimated source depth of 0.5 cm. The observation that the gamma hyperprior 
leads to a posterior density that is qualitatively closer to the Gaussian prior, whose 
mean and maximum coincide, is in line with the fact that, as r ^ oo, the MAP 
estimate approaches the minimum norm, or Tikhonov regularized, solution. The 
latter corresponds to an i'^ prior model. 

The sample histories of single components ||qj || = + and Oj reveal 

an interesting feature of the posterior density. Figure [5] shows the sample histories of 
the components corresponding to the maximum value of ^CM- The sample histories 
with the gamma hyperprior exhibit good mixing, while those corresponding to the 
inverse gamma distribution seem to suggest a bimodal posterior density. 

Note that the posterior mean of the prior variance is in good agreement with 
the posterior variance of the current amplitude, implying that the reliability of the 
posterior mean current in this geometry could be assessed directly from the mean 
of the variance parameter 0. 

EEG/MEG in realistic geometry. To validate the hyperprior models in a more 
realistic geometry, we performed EEG and MEG tests with a realistic human head 
model, based on MRI data. We assume that both electric and magnetic fields are 
measured outside the skull at 31 different locations, as shown in Figure [6) This 
model is partitioned into four domains of constant electric conductivity: scalp, 
skull, cerebrospinal fluid (CSF), and brain, as illustrated in Figure [6j Anisotropics 
in the brain as well as possible electroconductive differences between the gray and 
the white matter are ignored. 

The electric and magnetic lead field matrices for this setup were constructed 
using the complete electrode model and a Finite Element Method (FEM), described 
in the Appendix, under the assumption that all electrode contact impedances are 
equal to one. The FEM mesh was generated in two stages. The meshes for the 
skull and the brain domains were generated first using triangular elements. The 
meshes for the scalp and CSF domains were then generated by positioning prism 
elements, each to be subsequently divided into three tetrahedral elements, between 
the brain and the skull surface. The total number of tetrahedral elements in the 
head mesh is 108 914. The electric conductivities of the domains are given in Table 
[l] These values are as in [32^ . 
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Figure 6. Sagittal, coronal, and axial projections (left to right) of 
the head model (top row) and the conductivity distribution inside 
it (bottom row). The electric field is measured using 31 contact 
electrodes marked by dark grey surface patches. The 31 mag- 
netic field measurement locations are indicated by the lighter dark 
spheres over the patches. 




Figure 7. Sagittal, coronal, and axial projections (left to right) of 
the reference current density (top row) and of the region of interest 
(bottom row). 



Lowest order i^(div)-conforming Raviart-Thomas elements were applied for the 
source current density. In these elements, basis functions are linear over the tetra- 
hedron, vanish at one of the vertices, and have a constant direction normal to the 
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Figure 8. Sagittal, coronal, and axial projections (left to right) 
of the MAP estimate from EEG data of the current for the gamma 
hypermodel (top row) and of the inverse gamma hypermodel (bot- 
tom row). To improve the readability of the three dimensional 
plots, only the current elements whose amplitude is above 5% of 
the maximum of the amplitudes in the estimate were plotted. 




Figure 9. Sagittal, coronal, and axial projections (left to right) 
of the CM estimate from EEG data of the current for the gamma 
hypermodel (top row) and for the inverse gamma hypermodel (bot- 
tom row). 



face opposite to the vertex where they vanish [3]. For a curl-free current density, e.g. 
a dipole source, it is necessary to use i7(div)- conforming elements [18 , i.e. elements 
in which basis functions and their divergences are square integrable. Basis functions 
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Figure 10. Sagittal, coronal, and axial projections (left to right) 
of the MAP estimate from MEG data of the current for the gamma 
hypermodel (top row) and for the inverse gamma hypermodel (bot- 
tom row). 




Figure 11. Sagittal, coronal, and axial projections (left to right) 
of the CM estimate from MEG data of the current for the gamma 
hypermodel (top row) and for the inverse gamma hypermodel (bot- 
tom row). 



for iJ(div) -conforming elements can be interpreted to represent dipole-like currents 

m- 

The electric potential u was modelled using Lagrange elements [3 . Quadratic 
Lagrange elements and a fifteen point Gauss quadrature rule [3] were employed to 
generate simulated, noiseless reference data, while the exploration of the posterior 
was done using linear Lagrange elements and a four point Gauss quadrature to 
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Figure 12. Sagittal, coronal, and axial projections (left to right) 
of the MAP estimate from EEG data of the variance of the prior 
for the gamma hypermodel (top row) and for the inverse gamma 
hypermodel (bottom row). In the top row, location of the compo- 
nents of larger than 5% of the maximum value are marked with a 
bubble. In the bottom row only elements whose values were greater 
than or equal to 85% of the maximum were plotted, indicating that 
the estimated variance is nearly maximal over the entire brain. 




Figure 13. Sagittal, coronal, and axial projections (left to right) 
of the CM estimate from EEG data of the variance of the prior 
for the gamma hypermodel (top row) and for the inverse gamma 
hypermodel (bottom row). The thresholding level is set to 5% in 
this plot. 



speed up the computation. The resulting forward modelling error for the electric 
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Figure 14. Sagittal, coronal, and axial projections (left to right) 
of the MAP estimate from MEG data of the variance of the prior for 
the gamma hypermodel (top row) and inverse gamma hypermodel 
(bottom row). The thresholding in the visualization is as in Fig- 
ure [H 




Figure 15. Sagittal, coronal, and axial projections (left to right) 
of the CM estimate from MEG data of the variance of the prior 
for the gamma hypermodel (top row) and for the inverse gamma 
hypermodel (bottom row). Here, the visualization threshold is set 
to 5%. 



lead field matrix in the brain domain was approximately 3 % in the Frobenius norm 
[3], and approximately 2 % for the magnetic lead field matrix. 

The reference current density used to generate the reference data, shown in 
Figure [7| corresponds to the case where we have three dipole-like source currents, 
one positioned deeply, approximately 2.5 cm under the occipital lobe, and the other 
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Figure 16. Sagittal, coronal, and axial projections (left to right) 
of the sample based estimates of the posterior variance from EEG 
data for the gamma hypermodel (top row) and for the inverse 
gamma hypermodel (bottom row). The visualization threshold is 
set at 5% here. 




Figure 17. Sagittal, coronal, and axial projections (left to right) 
of the sample based estimates of the posterior variance from MEG 
data for the gamma hypermodel (top row) and for the inverse 
gamma hypermodel (bottom row). The visualization threshold is 
set to 5%. 



two on the surface layer modelling the cortex. The source on the right of the frontal 
lobe is almost tangential to the surface, and the source close to the left central sulcus 
is almost normal to the surface. 
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Figure 18. Sample histories of the component corresponding to 
the maximum of the posterior mean estimate of the current esti- 
mate from the EEG data and the gamma hypermodel (top left) 
and with the inverse gamma hypermodel (bottom left). In the 
right column we show the corresponding plots obtained from the 
MEG data and the gamma distribution (top right) or the inverse 
gamma distribution (bottom right). 



As in the planar geometry example, the standard deviation in the likelihood 
model is assumed to be 5% of the maximum noiseless signal. In this case, to 
simulate real situations, the noise is added to the generated data. 

The MAP estimate with the IAS algorithm and posterior mean by MCMC sam- 
pling over the ROI are computed, corresponding to two different hyperprior models 
(gamma, inverse gamma) and to the EEG and MEG recording modalities. In these 
examples, the scaling and shape parameter values are held constant at = 10~^ 
and P = 1.55 and up to 20 iterations of the IAS methods were allowed to compute 
the MAP estimates. The ROI consisted of three disjoint sets containing together 
5 982 elements, see Figure [7| Corresponding to each combination of hyperprior 
model and recording modality, an MCMC sample of size M = 50 000 is generated, 
assuming that impressed currents differ from zero only inside the ROI. 

In Figures |8] - 19) the MAP and posterior mean estimates of the current vector 
a using the EEG data are superimposed with three different projections (sagittal, 
coronal, and axial) of the brain. The top row corresponds to the gamma hyperprior 
and the bottom row to inverse gamma. Interestingly, the MAP estimate with the 
gamma distribution is more focal than with the inverse gamma, while with the 
posterior mean, the inverse gamma yields more focal estimates. The same behavior 



is also seen in the estimates based on the MEG data, see Figures 10-11 



Although focal, the estimate obtained from the EEG data is unable to locate 
all the sources. The preference for finding the right cortical source while missing 



CONDITIONALLY GAUSSIAN HYPERMODELS FOR CEREBRAL SOURCES 



25 



the left cortical source may be related to the orientation of the dipoles and the 
positioning of the electrodes in the simulation. 

Unlike in the case of the planar geometry, when using a realistic head model 
the deep source in the MAP estimates is not driven to the surface, indicating that 
there is enough geometric information in the three dimensional positioning of the 
electrodes or magnetometers to resolve the depth. 

The MAP and posterior mean estimates of the variance vector 9 of the prior are 



shown in Figures 12 - 13 using the EEG data and in Figures [14]- 15 for the MEG 



data. The results are in full agreement with the corresponding estimates for the 
currents. 

Finally, based on the MCMC sampling, we estimate the posterior variances 



Var(a) of the current vector. The estimated variances are visualized in Figures 16 



- [T7j Interestingly, even when the posterior estimates are blurry, the posterior 
variances are relatively focal, giving additional information of the likely positions 
of the sources. 

To monitor the performance of the MCMC sampler, in Figure 18 we have plotted 
the sample histories of the components of a and corresponding to the index of 
maximum value of the conditional mean estimate. In the realistic geometry, we do 
not encounter effects that would suggest multimodality of the distribution, and the 
mixing and convergence seem, by visual inspection, satisfactory. 



8. Discussion 

In this article, we consider the MEG/EEG inverse problems of localizing few 
focal sources in the framework of Bayesian hierarchical models. It is shown that 
by using a conditionally Gaussian prior combined with generalized gamma distri- 
butions as hyperpriors, a rich family of posterior densities ensues, and it is possible 
to generate numerous estimates, some of which are closely related to previously 
proposed, regularization based estimators. 

We propose a simple and effective numerical algorithm, the Iterative Alternating 
Sequential (IAS) algorithm for computing the MAP estimate simultaneously for the 
current density and its variance. The versatility of this approach is confirmed by its 
ability of producing an efficient fixed point implementation of several well known fo- 
cal reconstruction methods based on non-quadratic penalties or non-Gaussian prior 
distributions. Particular instances, that correspond to a choice of few scalar param- 
eters in the hypermodel, include the minimum current estimate, the ^^-regularized 
estimate and the limited support functional estimate. Furthermore, the efficient 
numerical implementation using iterative solvers gives also a natural interpreta- 
tion in the statistical framework for the FOCUSS algorithm when applied to the 
MEG/EEG imaging problem. Compared to the empirical Bayes methods proposed 
in the literature such as evidence maximization, or Automatic Relevance Determi- 
nation (ARD), Expectation Maximization (EM) and variational methods [26l [33 ] . 
the IAS algorithm is very fast, easy to implement and does not require sophisti- 
cated minimization methods, nor does it lead to intractable integral expressions. 
As shown in this article, explicit expressions for the iterative minimization steps 
can be found with numerous choices of the hyperprior parameters without requiring 
conjugacy property of the hyperprior. 

The different choices of the hypermodel parameter lead to algorithms that be- 
have qualitatively similarly with respect to the MAP estimation: when the depth 
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resolution due to the measurement geometry is poor, as is the case in the local 
half space model, superficial focal sources are localized well, while deep sources 
are biased towards the surface. The MCMC analysis of the posterior distributions, 
however, reveals a qualitative difference between the hypermodels with different 
values of the hypermodel parameter r. In the case where r = 1, corresponding to 
the gamma hypermodel, the MAP estimate coincides with the minimum current 
estimate and, like the posterior mean estimate, is biased towards superficial sources. 
With r = — 1, yielding the inverse gamma hypermodel, the MAP estimate is also 
biased towards superficial sources, but the posterior mean is not. This qualitative 
difference may be interpreted by saying that the smaller the parameter r, the less 
Gaussian the model, and in the limit as r ^ oo we obtain the minimum norm 
solution. Note that for Gaussian distributions, the posterior mean and the MAP 
estimates coincide. 

From the point of view of Bayesian modelling paradigm, the qualitatively differ- 
ent behavior of solutions corresponding to different hyperpriors may seem strange, 
since in none of the cases any a priori preference to superficial sources is given. 
The explanation is related to the parameter values in the hyperpriors. The scal- 
ing parameter Oq in the gamma distribution was chosen very small in our example 
with planar geometry to obtain good localization in the tangential direction. Since 
the mean of the gamma distribution is 0o/3, this choice of favors small current 
dipoles in the conditional mean, and the energetically easiest way to achieve this is 
to place all the currents on the surface. The mean of the inverse gamma distribution, 
0o/{(3 — 1)^ is also a small number for this choice of Oq^ but since this distribution 
allows significantly larger outliers, it has no difficulty in letting few large dipoles 
in the lower layers explain the data. For completeness, we also performed MCMC 
runs with the gamma distribution with considerably larger scaling parameter val- 
ues. The result suggest that when the scaling parameter is large enough to allow 
dipoles of the correct size in the lower layers where the true dipole lies, the focal 
properties of the posterior mean are lost. We conclude that the findings are in line 
with the Bayesian philosophy and seem to suggest that the inverse gamma prior is 
more suitable for deep sources. 

When using a three-dimensional more realistic geometry, the question of depth 
resolution becomes more delicate since the geometry starts to play a significant 
role, and measurements from different directions possibly contribute to the depth 
resolution more than the prior. 

The computed results obtained with the realistic head model are in good agree- 
ment with the planar case results, suggesting that the posterior mean estimate is 
most effective in combination with the inverse gamma prior. The MAP estimate, 
on the other hand, is most effective when applied in connection with the gamma 
prior. In general, for the inverse gamma hypermodel the posterior mean estimation 
produced better results with larger values of the scaling parameter Oq than MAP 
estimation, while for the gamma hypermodel, MAP estimates were superior to CM 
estimates with larger Oq values. In these model both estimate types were found to 
be more sensitive to the choice of the shape parameter P than in the case of the 
planar geometry, with large value of f3 leading to blurred estimates and ultimately 
to invisibility of the deep source current. The hyperparameter values = 10~^ 
and P = 1.55, used in the realistic case, were chosen based on visual inspection. 
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Future extensions of this work include a hierarchical extension of the model where 
the values of the hypermodel parameters will be chosen based on the data, and the 
extension of the formalism to include time dependent sources with a longitudinal 
correlation structure. 



9. Appendix: Complete electrode model and FEM for EEG/MEG 

This appendix describes briefly how the electric and magnetic lead field matrices 
can be constructed through the complete electrode model [28 and the FEM for a 
realistic three dimensional head, denoted here by Q. 

The complete electrode model assumes that a set of contact electrodes 61,62, 
. . . , 6l with contact impedances 2^1, ^2,. . . , is attached to the boundary dft. The 
electrode potential values are collected into a vector U = (/7i,/72,...,/7l) and the 
electric potential field u satisfies the equation 



(9) 



V • {a\/u) = V • J, 



as well as the boundary conditions 



(10) 



du 
- — 

dn 



J eo 



du 
dn 



ds = 0, 



dn) 



with i = 1, 2, . . . , L. Additionally, the Kirchoff's voltage law Xl^^i = is as- 
sumed to hold. The weak form of ([9| and (10) can be formulated by requiring that 
u e H^{n) = {w e L^{Q) : dw/dri e LJ{Q), i = 1, 2, 3 } and J G H(div; Q) = 
{we L^{n)^ : V • w G }. These function spaces are thoroughly discussed 

e.g. in [18]. 

The finite element discretized fields corresponding to u e H^{rt) and G 



H{diY;ft) can be defined as uj- 



and Jr = Xli=i<^iW^, respectively. 



Here, the functions ?/^27 • • • , i^N^ ^ H^{^) and wi, W2, . . . , w^Tj G i^(div; Q) are, 
respectively, scalar and vector valued finite element basis functions, defined on a 
shape regular finite element mesh T [3 , and the coefficients form the coordinate 
vectors ( = {(1X2^- • • , Cn^)^ and a = (ai, 0^2,. . . , a^j)^ - Furthermore, since the 
sum of the electrode potentials is assumed to be zero in the complete electrode 
model, it is required that U = {Ui^U2^- • • ^Ul)^ = RC,^ where is a vector and 
R G R^><(^~^) is a matrix with entries Rij = -Rj^ij = 1 for j = 1, 2, . . . , L - 1, 
and otherwise Rij = 0. 

The vectors a, ( and ( are linked through a symmetric and positive definite linear 
system of the form 



(11) 



B 


C ' 




' C " 




" Fa " 




G 




. c . 
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in which the submatrix entries are given by 

Jn 



Bij — I (jV^lJi • Vipj dVt + \^ — / ^i^j dS, 

Cij = -— [ (pidS^ ^ / (fidS, 

Gij = - [ dS^^ [ dS, 



where 5ij denotes the Kronecker delta. The system (11) arises from the Ritz- 
Galerkin discretization [3 of the weak form of ^ and (10). Similarly, a discretized 
version of the Biot-Savart law ([2| can be expressed as 5 = Wa — VC,^ where 5 is a 
vector containing the magnetic field values at the measurement locations, and the 
matrices are defined by 

TJ7 [ ■ w^- X (r^ - r) 

^i,30-i)+fc = _^|3 

T . f ^k- crVipj X (r^ - r) 

Vi^^(^_^^^_k. = / ^ 7^ <^r, 

Jn 



^i,3(j-l)+/c 

JQ 

with Tj denoting the jth measurement location and ek denoting the kth natural 
basis vector. 

The dependences of U and B on the vector a are described by the electric and 
magnetic lead field matrices and M"^, respectively. These matrices are given 
by 

= R{C^B-^C -G)-^C^B-^F, 
M"" = W -V{B -CG-^C^)-^F, 

as can be verified through straightforward linear algebra manipulations. Note that 
these expressions are valid only if a set of contact electrodes is attached to the head 
during the magnetic field measurement. 
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